Lecture 2: Introduction to the modelling landscape

Irena Papst

Guiding questions

🎯 What is a model?

🎯 What makes a model “good”?

🎯 What is the process of modelling?

🎯 What types of models are there?

🎯 What is a model?

These are all models

\[ y(t) = ke^{rt} \]

\[ Y = \beta_1X + \beta_0 \]

What do all of these things have in common?

All models are representations of reality

What might each of these models represent?

\[ y(t) = ke^{rt} \]

\[ Y = \beta_1X + \beta_0 \]

🎯 What makes a model “good”?

Model quality depends on model purpose


All models are wrong, but some are useful.

George E. P. Box (statistician, mid-1900s)

Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.


How much error can you tolerate in your model? Still depends on purpose!


Also, which resources do you have at your disposal?

Example: How long would it take me to walk from my (old) office at McMaster to a lecture hall?

First pass: low resolution

Suppose all I have at my disposal is this campus map:

I know it takes me 10 minutes to walk from my old office to parking lot N.

Visually estimating the relative distances, I guess it would take me 5 minutes to walk to JHE.

Second pass: higher resolution

Suppose I now have more time and/or resources and I’m able to get (most of) the path I’d usually take measured:

distance = 350 m

average walking speed of an adult = 5 km/h, or about 83 m/min

time = distance / speed

estimated time from HH to JHE: 4.2 min or 4 min, 12 seconds
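The arithmetic above can be reproduced in a few lines. This is only a sketch using the numbers on the slide; the variable names are illustrative choices.

```python
# Second-pass estimate of walking time, using the numbers from the slide.
distance_m = 350   # measured path length, in metres
speed_kmh = 5      # assumed average adult walking speed, in km/h

speed_m_per_min = speed_kmh * 1000 / 60   # about 83.3 m/min
time_min = distance_m / speed_m_per_min   # time = distance / speed

# Split the decimal minutes into whole minutes and seconds.
minutes = int(time_min)
seconds = round((time_min - minutes) * 60)
print(f"{time_min:.1f} min, i.e. {minutes} min {seconds} s")
```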

Note: just because our model can give an answer as precise as 4 min 12 s doesn’t mean we should confuse this precision with accuracy! Several uncertainties can affect our estimate… Can you name some?

We may not know the size of the errors introduced by our assumptions, but we can at least think about their potential effect on the estimate (e.g. whether they increase or decrease the walking time).
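One way to reason about the direction of these effects is to rerun the estimate under a few alternative assumptions. The scenarios below (a slower or faster walker, an extra 50 m of unmeasured path) are hypothetical values chosen for illustration, not measurements.

```python
def walk_time_min(distance_m, speed_kmh):
    """Walking time in minutes, from time = distance / speed."""
    return distance_m / (speed_kmh * 1000 / 60)

baseline = walk_time_min(350, 5)

# Hypothetical scenarios (illustrative values only).
scenarios = {
    "slower walker (4 km/h)": walk_time_min(350, 4),
    "faster walker (6 km/h)": walk_time_min(350, 6),
    "unmeasured stretch adds 50 m": walk_time_min(400, 5),
}

for label, t in scenarios.items():
    direction = "increases" if t > baseline else "decreases"
    change = abs(t - baseline)
    print(f"{label}: {t:.1f} min ({direction} the estimate by {change:.1f} min)")
```

Even without knowing which scenario is closest to reality, the sign of each change tells us whether the assumption pushes the estimate up or down.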

🎯 What is the process of modelling?

Modelling is a mapping between the real world and a mathematical and/or statistical framework

It’s an iterative process, and the final mapping depends on the acceptable level of error.

🎯 What types of models are there?

\[ y(t) = ke^{rt} \]

\[ Y = \beta_1X + \beta_0 \]

How could we split these models into two different groups?

mechanistic (rule-based): \[ y(t) = ke^{rt} \]

phenomenological (descriptive):

\[ Y = \beta_1X + \beta_0 \]

Phenomenological models

\[ Y = \beta_1X + \beta_0 \]

  • describe the state of a system at one or various points in time
  • try to capture what the system looks like, not to explain why it looks like that

  • don’t necessarily need any domain-specific knowledge or context to formulate
  • can be used for prediction (e.g. extrapolation), but without knowing how the data are generated, we could be badly wrong about the relationship in the region we seek to predict
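The extrapolation risk can be seen in a small sketch. The data here are hypothetical: they are generated from an exponential process with made-up values of k and r, then described with a least-squares line \(Y = \beta_1X + \beta_0\). The line is a fair description over the observed range but drifts far from the process outside it.

```python
import math

# Hypothetical data generated from y = k * exp(r * t); k and r are made up.
k, r = 1.0, 0.3
ts = [0, 1, 2, 3, 4]
ys = [k * math.exp(r * t) for t in ts]

# Ordinary least squares fit of Y = beta_1 * X + beta_0.
n = len(ts)
t_mean = sum(ts) / n
y_mean = sum(ys) / n
s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
s_tt = sum((t - t_mean) ** 2 for t in ts)
beta_1 = s_ty / s_tt
beta_0 = y_mean - beta_1 * t_mean

def line(t):
    """Fitted phenomenological (linear) description."""
    return beta_1 * t + beta_0

# Within the observed range, the line stays close to the data.
worst_in_range = max(abs(line(t) - y) for t, y in zip(ts, ys))

# Far outside the observed range, the line and the process disagree badly.
t_far = 15
truth_far = k * math.exp(r * t_far)
print(f"largest error for t in 0..4: {worst_in_range:.2f}")
print(f"at t = {t_far}: line predicts {line(t_far):.1f}, process gives {truth_far:.1f}")
```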

Mechanistic models

\[ y(t) = ke^{rt} \]

  • try to come up with rules that explain the observed behaviour of a system
  • capture how the system behaves (why it looks the way it does at various points in time)
  • need domain-specific knowledge and context
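As an illustration of what a "rule" can look like, here is one mechanism that leads to the exponential form above. The rule itself is an assumption for this sketch (it is not stated on the slide): the quantity grows at a rate proportional to its current size, with \(r\) the per-unit growth rate and \(k > 0\) the initial value.

\[ \frac{dy}{dt} = r y, \qquad y(0) = k \]

Separating variables and integrating gives

\[ \int \frac{dy}{y} = \int r \, dt \quad \Rightarrow \quad \ln y = rt + C \quad \Rightarrow \quad y(t) = e^{C} e^{rt} \]

and applying the initial condition \(y(0) = k\) fixes \(e^{C} = k\), so

\[ y(t) = ke^{rt} \]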

10.1016/j.cois.2016.07.006

Phenomenological 🤝 Mechanistic

  1. gather descriptions (work with phenomenological models)
  2. propose a mechanism
  3. generate predictions
  4. compare to original descriptions & gather new ones
  5. (if predictions ≠ descriptions) propose a new mechanism
  6. repeat 3-5 as needed

10.1371/journal.pclm.0000226

Other model characterizations

  • Dynamic or static
    • Is there a time component? If so, dynamic
    • Otherwise, static
  • Discrete or continuous
    • in state: categorical or numerical variables
    • in time: \(y_{t+1} = c y_t\) vs \(dy/dt = cy\)
    • Simulations are discrete (discretized) in time!
  • Deterministic or stochastic
    • Is there noise incorporated in the model?
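The distinctions above can be made concrete with a minimal sketch, assuming a rate c = 0.5 and an initial value of 1 (both made up). It compares the exact solution of \(dy/dt = cy\) with a forward Euler simulation, which replaces the continuous model with the discrete update \(y_{n+1} = (1 + c\,\Delta t)\,y_n\), and then adds Gaussian noise to the rate as one possible way to make the model stochastic. Note that in the discrete-time model \(y_{t+1} = c y_t\) on the slide, \(c\) is a per-step multiplier, whereas here \(c\) is a continuous-time rate.

```python
import math
import random

c, y0 = 0.5, 1.0   # hypothetical growth rate and initial value

def exact(t):
    """Continuous time: exact solution of dy/dt = c*y with y(0) = y0."""
    return y0 * math.exp(c * t)

def euler(t_end, dt):
    """Simulation: forward Euler discretizes time into steps of size dt."""
    y = y0
    for _ in range(round(t_end / dt)):
        y += dt * c * y   # y_{n+1} = y_n + dt * c * y_n
    return y

def noisy_euler(t_end, dt, sd, rng):
    """Stochastic variant: the rate is perturbed by Gaussian noise each step."""
    y = y0
    for _ in range(round(t_end / dt)):
        y += dt * (c + rng.gauss(0, sd)) * y
    return y

print("exact solution at t = 2:", exact(2.0))
print("Euler, dt = 0.5        :", euler(2.0, 0.5))
print("Euler, dt = 0.01       :", euler(2.0, 0.01))

rng = random.Random(1)   # seeded so the example is repeatable
run_a = noisy_euler(2.0, 0.01, 1.0, rng)
run_b = noisy_euler(2.0, 0.01, 1.0, rng)
print("two stochastic runs    :", run_a, run_b)
```

The deterministic simulation returns the same value every time and approaches the exact solution as the time step shrinks, while the stochastic runs differ from one another even with identical parameters.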

Summary

🎯 What is a model?

  • A representation of reality

🎯 What makes a model “good”?

  • Depends on the purpose of the model

🎯 What is the process of modelling?

  • An iterative mapping between the real world and a mathematical and/or statistical framework

🎯 What types of models are there?

  • phenomenological or mechanistic
  • dynamic or static
  • discrete or continuous
  • deterministic or stochastic